Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Efficient Tensor conversion from list of numpy arrays #1071

Merged
merged 1 commit into from
Nov 13, 2023

Conversation

josalhor
Copy link
Contributor

While playing around with pomegranate v1 I run into this warning:

/home/josalhor/Desktop/data-synt/venv/lib/python3.11/site-packages/pomegranate/_utils.py:62: UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at ../torch/csrc/utils/tensor_new.cpp:261.)

I think my pull requests fixes that warning and should speed up the casting.

@jmschrei
Copy link
Owner

Would you mind undoing the version bump? Usually I'll only do a bump after a few changes, and have a corresponding release when I do it.

@josalhor
Copy link
Contributor Author

Absolutely no problem. Actually, I did the version bump because GitHub Actions reported a cancellation but no failure and wanted to retest. Will undo the commit.

@jmschrei
Copy link
Owner

Thanks!

@jmschrei jmschrei merged commit a446005 into jmschrei:master Nov 13, 2023
3 of 6 checks passed


if isinstance(value, list) and all(isinstance(v, numpy.ndarray) for v in value):
value = numpy.array(value)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem to have a return statement in all cases?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's saying that if value is a list of numpy arrays, cast value as a single numpy array. L63 will then take that new numpy array and operate on it, but more efficiently than if it were a list of arrays.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doh!

@josalhor
Copy link
Contributor Author

Did this PR break the build?

@jmschrei
Copy link
Owner

No. There are some unit tests that stochastically raise NaNs. When I manually re-run them they run successfully. 100% of the time these same tests run successfully on my local machine so I'm not sure what their source is.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants